
Andersen Pointer Analysis #842

Draft
fabianbs96 wants to merge 39 commits into secure-software-engineering:development from fabianbs96:f-AndersOTFAA

Conversation

@fabianbs96
Member

Adds AndersenOTFSolver, a context- and field-insensitive Andersen-style points-to analysis that co-refines the call graph and alias sets in a single fixpoint.
Unlike the staged pipeline (resolver → PA), the solver owns its own function-worklist loop: direct calls add callees immediately; indirect calls are resolved as pts(fp) grows.
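
The function-worklist idea can be illustrated with a small, self-contained sketch. It is not the AndersenOTFSolver code: `ToyFunction`, `ToyProgram`, and `reachableFunctions` are hypothetical names, and function pointers are modelled as global variable names rather than IR values. It only shows how direct callees are enqueued immediately while indirect callees are enqueued whenever pts(fp) gains a target.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model of a function body; not PhASAR's or LLVM's IR.
struct ToyFunction {
  std::vector<std::string> DirectCallees;  // call @callee
  std::vector<std::string> IndirectCalls;  // call through pointer variable fp
  std::vector<std::pair<std::string, std::string>> AddrOf;  // fp = &callee
};
using ToyProgram = std::map<std::string, ToyFunction>;

// Returns all functions reachable from Entry while pts(fp) and the call
// graph grow together in a single worklist loop.
std::set<std::string> reachableFunctions(const ToyProgram &Prog,
                                         const std::string &Entry) {
  std::set<std::string> Reached;
  std::deque<std::string> Worklist;
  std::map<std::string, std::set<std::string>> Pts;  // pts(fp)
  std::set<std::string> CalledPtrs;  // pointers used at a reached call site

  auto Enqueue = [&](const std::string &F) {
    if (Reached.insert(F).second)
      Worklist.push_back(F);
  };

  Enqueue(Entry);
  while (!Worklist.empty()) {
    const std::string F = Worklist.front();
    Worklist.pop_front();
    auto It = Prog.find(F);
    if (It == Prog.end())
      continue;  // external function without a body
    const ToyFunction &Body = It->second;

    // Direct calls add their callee immediately.
    for (const auto &Callee : Body.DirectCallees)
      Enqueue(Callee);

    // A new target in pts(fp) resolves call sites that were seen earlier.
    for (const auto &[Ptr, Target] : Body.AddrOf) {
      if (Pts[Ptr].insert(Target).second && CalledPtrs.count(Ptr))
        Enqueue(Target);
    }

    // A new indirect call site is resolved against the current pts(fp).
    for (const auto &Ptr : Body.IndirectCalls) {
      CalledPtrs.insert(Ptr);
      for (const auto &Target : Pts[Ptr])
        Enqueue(Target);
    }
  }
  return Reached;
}
```

In this toy model, a function that is only ever stored into a pointer becomes reachable as soon as both the store and a call through that pointer have been processed, in either order.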

Key features

  • On-the-fly call graph: direct, function-pointer, vtable, and
    struct-field vtable calls are all resolved during the fixpoint.
  • Online cycle detection (LCD): detects strongly-connected components
    lazily during propagation and merges them via union-find to avoid
    redundant work.
  • Delta propagation: propagates only the incremental PendingPts wave
    per node rather than recomputing full set differences, eliminating
    per-iteration posix_memalign allocations.
  • MemorySSA-guided load handling: uses MemorySSA (with BasicAA, TBAA,
    and ScopedNoAlias) to determine reaching stores for each load, allowing
    precise alias edges instead of conservative load constraints where the
    def-chain is known.
  • Library summaries: applies FDFF-based flow summaries for common libc
    functions; falls back to treating function-pointer arguments as reachable
    callbacks (soundy mode).
  • Efficient result construction: precomputes per-object alias bitmaps
    once and broadcasts via bitwise OR, replacing an O(N²) nested loop.
  • Exposed call graph: the computed LLVMBasedCallGraph is part of
    AndersenOTFResult and can be consumed by downstream analyses.
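
To make the delta-propagation item more concrete, here is a minimal sketch over copy edges only. It is an illustration under assumptions, not the solver's data structures: `DeltaNode`, `DeltaGraph`, and the use of `std::set<int>` are hypothetical stand-ins (the real solver presumably uses bit vectors and union-find representatives). Each node keeps a pending set of newly added objects, and only that set is pushed to successors.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical node layout for illustration only.
struct DeltaNode {
  std::set<int> Pts;               // full points-to set
  std::set<int> Pending;           // objects added since the last visit
  std::vector<std::size_t> Succs;  // copy edges: pts(this) is a subset of pts(succ)
};

class DeltaGraph {
public:
  explicit DeltaGraph(std::size_t N) : Nodes(N) {}

  // Adds a copy edge and seeds the target with what the source already has.
  void addCopyEdge(std::size_t From, std::size_t To) {
    if (From == To)
      return;  // a self-edge never adds information
    Nodes[From].Succs.push_back(To);
    for (int Obj : Nodes[From].Pts)
      addPointee(To, Obj);
  }

  // Records Obj in pts(N); only genuinely new objects become pending.
  void addPointee(std::size_t N, int Obj) {
    if (Nodes[N].Pts.insert(Obj).second) {
      Nodes[N].Pending.insert(Obj);
      Worklist.push_back(N);
    }
  }

  // Propagates pending deltas until no node has unprocessed objects.
  void solve() {
    while (!Worklist.empty()) {
      const std::size_t N = Worklist.back();
      Worklist.pop_back();
      std::set<int> Delta;
      Delta.swap(Nodes[N].Pending);  // take the wave, leave Pending empty
      for (std::size_t Succ : Nodes[N].Succs)
        for (int Obj : Delta)
          addPointee(Succ, Obj);
    }
  }

  const std::set<int> &pts(std::size_t N) const { return Nodes[N].Pts; }

private:
  std::vector<DeltaNode> Nodes;
  std::vector<std::size_t> Worklist;
};
```

Because an object enters a node's pending set at most once, each object crosses each copy edge at most once, even when the edges form a cycle. Collapsing such cycles into one representative, as the cycle-detection item describes, would remove even that remaining work.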

The work in this branch is largely AI-generated.
It was an experiment in how autonomously I can use Claude Code for coding.
I have reviewed every line of code manually. Some parts had to be rewritten by hand.

fabianbs96 and others added 30 commits April 23, 2026 19:40
…ing in AndersenOTFSolver

- grow() may reallocate Nodes; all constraint methods now call every
  grow() before indexing Nodes[X], and snapshot pts sets before any
  addAssignEdge call that fires inside a foreach callback
- onNewPointee snapshots all four constraint lists upfront for the
  same reason
- merge() snapshots NonRep vectors before any addAssignEdge call, and
  retroactively fires load/store/memcopy constraints for Rep's merged
  pts set (previously those constraints were silently dropped for
  already-existing pointees)
- ConnectKnownTargets and checkUnresolvedFPCalls snapshot pts(FPId)
  before iterating: connectCallee->propagate() can grow that set
- handleCall now collects all resolved IDs per argument (not just the
  last one) via SmallVector<ValueId,2> per slot; FPCallRecord::Args
  and connectCallee updated accordingly
- Add dedup guards (LoadDstSet, StoreSrcSet, MemCopyAs{Src,Dst}Set)
  to NodeInfo to avoid redundant constraint firing
- Remove unused NoArgId sentinel and <cstdint> include
- Mark rep() [[nodiscard]]
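
A rough sketch of the snapshot-before-iterate idiom described in this list, using a store constraint `*P = V` as the example. `SnapNode`, `SnapGraph`, and `applyStore` are hypothetical names and the node layout is an assumption. The point is only that the sets being iterated are copied first, so that growing the node table or a points-to set inside the loop cannot invalidate the iteration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node: object ids double as node ids in this toy model.
struct SnapNode {
  std::vector<std::size_t> Pts;
};

class SnapGraph {
public:
  // Ensures node Id exists; may reallocate Nodes and invalidate references.
  void grow(std::size_t Id) {
    if (Id >= Nodes.size())
      Nodes.resize(Id + 1);
  }

  void addPointee(std::size_t N, std::size_t Obj) {
    grow(N);
    auto &Pts = Nodes[N].Pts;
    if (std::find(Pts.begin(), Pts.end(), Obj) == Pts.end())
      Pts.push_back(Obj);
  }

  // Store constraint *P = V: for every O in pts(P), pts(O) gains pts(V).
  void applyStore(std::size_t P, std::size_t V) {
    grow(P);
    grow(V);
    // Snapshots: pts(P) may contain P itself, and grow() may move Nodes,
    // so neither set is iterated in place.
    const std::vector<std::size_t> Targets = Nodes[P].Pts;
    const std::vector<std::size_t> Values = Nodes[V].Pts;
    for (std::size_t O : Targets)
      for (std::size_t X : Values)
        addPointee(O, X);  // safe: only the copies are being iterated
  }

  bool hasPointee(std::size_t N, std::size_t Obj) const {
    if (N >= Nodes.size())
      return false;
    const auto &Pts = Nodes[N].Pts;
    return std::find(Pts.begin(), Pts.end(), Obj) != Pts.end();
  }

private:
  std::vector<SnapNode> Nodes;
};
```

The copies cost an allocation per call; whether that is acceptable in the real solver depends on how often these constraint methods fire.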

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix OperandOf::operator< (was comparing R2.Inst instead of R1.Inst)
- DeepChainTwoObjectsMerge (context_04_1): three-level id chain with x/y
- RecursiveSelfAlias (context_08): SCC collapsing under self-recursion
- MutualRecursionAlias (context_10_0): Forth↔Back two-way recursion
- ReturnSecondArgContextInsensitive (context_12_1): argretq precision
- FuncPtrCallbackIdentity (context_14_1): OTF resolves indirect call
- RecursionTwoObjectsMerge (context_09_0): recursive with two objects
- MutualRecursionTwoObjects (context_10_1): mutual recursion, two objects
- ThreeWayMutualRecursion (context_11_0): Forth↔Back↔Stop recursion
- ThreeArgReturnQContextInsensitive (context_13_1): three-param function
- FuncPtrCallbackThreeWayMerge (context_14_2): three function pointers
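
For the operator< fix, one way to avoid mixing up the two operands is a lexicographic comparison via `std::tie`. The sketch below uses a hypothetical `OperandRef` with assumed fields `InstId` and `OpIdx`. The real OperandOf layout is not shown in this PR text, so this only illustrates the pattern.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical stand-in for OperandOf; the field names are assumptions.
struct OperandRef {
  unsigned InstId;  // id of the owning instruction
  unsigned OpIdx;   // operand position within that instruction

  // Lexicographic order: each side contributes only its own fields,
  // so R1 and R2 cannot be confused within a single comparison.
  friend bool operator<(const OperandRef &R1, const OperandRef &R2) {
    return std::tie(R1.InstId, R1.OpIdx) < std::tie(R2.InstId, R2.OpIdx);
  }
};
```

Ordered containers such as `std::set` and `std::map` require a strict weak ordering, and a comparator that reads the same operand on both sides of a field comparison may no longer provide one.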

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
XXX: Should we allow passing-in an instance of LLVMFunctionDataFlowFacts?
Root-cause was integral stores being found as reaching definition for a ptr-load
fabianbs96 and others added 2 commits June 4, 2026 17:48
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@fabianbs96 fabianbs96 self-assigned this Jun 4, 2026
@fabianbs96 fabianbs96 added the enhancement New feature or request label Jun 4, 2026